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SPECTRAL DIAGONAL ENSEMBLE KALMAN FILTERS 

IVAN KASANICKY*, JAN MANDEL**, AND MARTIN VEJMELKA* 


Abstract. A new type of ensemble Kalman filter is developed, which is based on replacing the sample covariance 
in the analysis step by its diagonal in a spectral basis. It is proved that this technique improves the aproximation of the 
covariance when the covariance itself is diagonal in the spectral basis, as is the case, e.g., for a second-order stationary 
random held and the Fourier basis. The method is extended by wavelets to the case when the state variables are 
random fields which are not spatially homogeneous. Efficient implementations by the fast Fourier transform (FFT) 
and discrete wavelet transform (DWT) are presented for several types of observations, including high-dimensional 
data given on a part of the domain, such as radar and satellite images. Computational experiments confirm that the 
method performs well on the Lorenz 96 problem and the shallow water equations with very small ensembles and over 
multiple analysis cycles.. 


1. Introduction. Data assimilation consists of incorporating new data periodically into com 
putations in progress, which is of interest in many fields, including weather forecasting (e.g., Kalnay 


2003 


Lahoz et ah, 2010). One data assimilation method is filtering (e.g., Anderson and Moore 


1979), which is a sequential Bayesian estimation of the state at a given time given the data re¬ 


ceived up to that time. The probability distribution of the system state is advanced in time by a 
computational model, while the data is assimilated by modifying the probability distribution of the 
state by an application the Bayes theorem, called analysis. In the methods considered here, data is 
assimilated in discrete time steps, called analysis cycles, and the probability distributions are repre¬ 
sented by their mean and covariance (thus making a tacit assumption that they are at least close to 
gaussian). When the state covariance is given externally, bayesian estimation becomes the classical 
optimal statistical interpolation (OSI). The Kalman filter (KF) uses the same computation as OSI 
in the analysis, but it evolves the covariance matrix of the state in time along with the model state. 
Since the covariance matrix can be large, the KF is not suitable for high-dimensional systems. The 


ensemble Kalman filter (EnKF) (Evensen 2009) replaces the state covariance by the sample covari¬ 
ance computed from an ensemble of simulations, which represent the state probability distribution. 


It can be proved that the EnKF converges to the KF in the large ensemble limit (Kwiatkowski and 


Mandel, 

2014 

Le Gland et ah 

2011; 

Mandel et ah, 

2011) 


2011) in the gaussian case, but an acceptable 


approximation may require hundreds of ensemble members (Evensen, 2009), because of spurious 


long-distance correlations in the sample covariance due to its low rank. Localization techniques 


(e.g-, 

Anderson 

2001 

Furrer and Bengtsson, 

2007 

Hunt et ah, 

2007 

), essentially suppress long- 

distance covariance terms (Sakov and Bertino 

2010 

, which improves EnKF performance for small 


ensembles. 


FFT EnKF (Mandel et ah, 2010ajb) was proposed as an alternative approach to localization, 


based on replacing the sample covariance in the EnKF by its diagonal in the Fourier space. This 
approach is motivated by the fact that a random field in cartesian geometry is second order station¬ 
ary (that is, the covariance between the values at two points depends only on their distance vector) 
if and only if its covariance in the Fourier space is diagonal (e.g., Pannekoucke et al., 2007). On a 


sphere, an isotropic random field has diagonal covariance in the basis of spherical harmonics ( Boer[ 
1983), so similar algorithms can be developed there as well. However, the stationarity assumption 
does not allow the covariance to vary spatially. For this reason, the FFT EnKF was extended 


to wavelet EnKF (Beezley et ah, 2011). The use of wavelets results in an automatic localization, 


which varies in space adaptively. For wavelets, the effect of the diagonal spectral approximation is 


1 Institute of Computer Science, Academy of Sciences of the Czech Republic 
department of Mathematical and Statistical Sciences, University of Colorado Denver 


1 






















































































equivalent to a weighted spatial averaging of local covariance functions (Pannekoucke et ah, 2007). 


Diagonal matrices are cheap to manipulate computationally, but implementing the multivariate 
case and general observation functions is not straighthforward. 

Diagonal spectral approximation and, more generally, sparse spectral approximation, have been 
used as a statistical model for the background covariance in data assimilation in meteorology for 


some time. The optimal statistical interpolation system from Parrish and Derber (1992) was based 


on a diagonal approximation in spherical harmonics, already used as horizontal basis functions in 
the model, and a change of state variables into physically balanced analysis variables. The ECMWF 


3DVAR system (Courtier et al., 1998) also used diagonal covariance in spherical harmonics. Diag¬ 


onal approximation in the Fourier space for homogeneous 2D error fields, with physically balanced 
crosscovariances, was proposed in Berre (2000). The Fourier diagonalization approach was extended 
by Pannekoucke et al. (2007) to sparse representation of the background covariance by thresholding 


wavelet coefficients, and into a combined spatial and spectral localization by Buehner and Charron 


(2007). 


While modeling of background covariances typically uses multiple sources including historical 
data, the EnKF builds the covariance in every analysis cycle from the ensemble itself. In this paper, 
we prove that replacing the sample covariance by its spectral diagonal improves the approximation 
when the covariance itself is diagonal in the spectral space, as is the case, e.g., when the state is a 
second order stationary random held and a Fourier basis is used. The result, however, is general and 
it applies to an arbitrary orthogonal basis, including wavelets. We also develop computationally 
efficient spectral EnKF algorithms, which take advantage of the diagonal form of the covariance, in 
the multivariate case and for several important classes of observations. We demonstrate the methods 
on computational examples with the Lorenz 96 system and shallow water equations, which show 
that good performance can be achieved with very small ensembles. 

2. Notation. Vectors in W 1 or C n are typeset as u and understood to be columns. Random 
vectors are typeset as X. The entry i of X is denoted by (A)^. Matrices (random or deterministic) 
are typeset as A, and and A* is the transpose, or conjugate transpose in the complex case. The 
entry i, j of matrix A is denoted by (A) i ■ or atj, and A = [ai,..., a n ] is the writing of a matrix 
as a collection of columns. Nonlinear operators are typeset as Ai. The mean value is denoted by 
E [•], and Var is the variance. N ( 0,1) is the normal (gaussian) distribution with zero mean and 
unit variance, and N (m, C) is the multivariate normal distribution with mean m and covariance 

C. The Euclidean norm of a vector is llttll = ( W"_, Iml ) . The Frobenius norm of a matrix is 


l_A.ll_ ( 


_ / V—v m \ - tl 

— I 2si=i 2^j= 


l i 


1/2 


3. Kalman filter and ensemble Kalman filter. The state of the system at time t is de¬ 
scribed by a random vector Xf of length n. The system evolution between two times t\ and t 2 is 
given by a function A4(.,t\,t2), so that 




(3.1) 


The goal of the Kalman filter (KF) (Kalman, 1960) is to correct the forecast state of the system X\ 
to obtain the analysis estimate Xf of the true state X t , given noisy observations Y t = H t X t + et, 
where HR is an observation operator, i.e., a mapping from state space to a data space, and et ~ 
N (0, Rt). When the distributions of the state Xt and the data error are gaussian, the analysis 
satisfies 


Xf = X\- QH t * (H f CfH* + R t 


,-i 


H t X\ - Yt 


(3.2) 




























where C t is the covariance of the forecast X\. In the KF, the state is represented by its mean and 


covariance, and the mean is transformed also by (3.1) and (3.2). In the rest of the paper, we will 


drop the time index t and the superscript f, unless there is a danger of confusion. 


In the EnKF, the analysis formulas (3.1) and (3.2) are applied to each ensemble member, with 
the covariance replaced by the sample covariance from the ensemble. The resulting ensemble, 
however, would underestimate the analysis covarian ce, which is co rrected by a data perturbation 
by sampling from the data error distribution (Burgers et ah, 1998). Denote by X 1 ,... ,X N the 


forecast ensemble, created either by a perturbation of a background state or by evolving each 


analysis ensemble member from the previous time step independently by (3.1). Then, the analysis 
ensemble members are 

-l 


X a ’ 3 = - C^H* (HC iV H* + R 


iV i 


where the sample covariance matrix is 

1 N 

C " = WTT E ( X ‘ - X) ( Xl * *)* • * = 

3 =1 


- Y 3 ) , 

(3.3) 

. N 

Yx 3 

N 

3 = 1 

(3.4) 


and Y 3 = Y + t 3 are the perturbed observations, with t 3 ~ N (0, R) independent. 

The advantage of the EnKF update formula (3.2) is that it can be implemented efficiently 
without having acces to the whole sample covariance matrix C‘ v . On the other hand, the rank of 
matrix is at most IV—1, and, in the usual case when N « n, the low rank of the approximation 
of the true forecast covariance C is the biggest drawback of the EnKF. 

4. Spectral diagonal EnKF. Let F be an orthonormal transformation matrix, which trans¬ 
form each ensemble member to spectral space, and denote each transformed ensemble member by 
the additional subscript F, Xp = FX 3 , j = 1,,1V. Since the transformation is orthonormal, 
the inverse transformation is F*, so F*XTp = X 3 for each j = 1 , ,N. The columns of the inverse 
transform matrix F* are the spectral basis elements u\,... ,u n , i.e., F = [tti, ... ,u n ]*. We will 
also denote the sample covariance of the transformed ensemble with the additional subscript F, 


n N _ 

'—p — 


1 


N 


N - 1 


1 


N 


E (X> T -X F )(x’ r -X F ) — FC^F*, X F = -Y,K- 


(4.1) 

3 =1 3 =1 

The idea of the spectral diagonal Kalman filter is to replace the sample covariance in the update 


formula (|3.3|) by only the diagonal elements of sample covariance in spectral space, 

ci,i 0 

0 C2,2 


= c£oi, 


0 


0 


0 

0 ( 'n.n 


C-iA — 


1 


N 


T E pit-(**■), 

3= 1 


(4.2) 


where o stands for Schur product, i.e., element-wise multiplication. The entries Ci^ are the sam¬ 
ple variances, computed without forming the whole matrix Cp. The diagonal approximation is 
transformed back to physical space as 


= F*Dp F, 

and the proposed analysis update is then 

X^J = x 3 - D^H (HD W H* + R)” 1 (HX J - Y 3 ) 
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(4.3) 

(4.4) 


















5. Error analysis. We will now compare the expected errors of the sample covariance and 
its spectral diagonal approximation (4.1). Assume that the ensemble members X 1 ~ N (X, C) are 
independent, and the columns of the inverse spectral transformation F* are eigenvectors Ui of the 
covariance C with the corresponding eigenvalues A*, 


F = [mi, 




Cuj = A iUi, FF* = I. 


(5.1) 


Equivalently, in the basis {«i,..., u n }, the covariance FCF*of FX 1 is diagonal, with the diagonal 
elements A,. This is the situation, e.g., when X 1 are sampled from a second-order stationary random 
field on a rectangular mesh, and itj is the Fourier basis. In the EnKF, the ensemble members after 
the first analysis cycle are not independent, because the sample covariance in the analysis step 
ties them together, but they converge to independent random vectors as the ensemble size N —> oo 
(Le Gland et al., 2011 Mandel et al., 2011 ). The following theorem shows that the spectral diagonal 
approximation has smaller expected error than the sample covariance, in Frobenius norm. 

Theorem 5.1 (Error of the spectral diagonal approximation). Let X k r^j N(X, C), k = 
1,... ,1V ; be independent, and the transformation F s atisfy (5.1). Then, the expected squared errors 
in th e Fr obenius norm of the sample covariance C N (3.f) and its spectral diagonal approximation 
D w (4-3) are 


E [||C — C 


AT||21 

f 


E 


[C-D^lH] 


1 

j 

(5.2) 

2— 1 

n 

EE 

2— 1 

i,j= 1 

(5.3) 


Proof. Without loss of generality, assume that X = 0. The Frobenius norm of a square 
matrix A = [a\,... ,a n ] is unitarily invariant, ||FAF*||p = ||A||p, because ||FA||p = 


2: iif-.ii 2 = 
1=1 




I 2 - IIAII 2 - 
! — H a Hf — 


Thus, 


i= 1 


E 


[C-C^ll 2 ] 


= E [||C F — C 


TV || 2 1 

F HfI 


X] E [(C F )jj — (Cp)jj] 

i,j =i 


n 

E 

hi= 1 


^F )ij\ 


iN\ 


= (Cf) i j- Lemma 


Var [(C{ 

in the Appendix 


A.3 


because the sample covariance is unbiased, E [(Cp jij j 
now gives (5.2). To prove (5.3), we consider the diagonal entries in the spectral domain, 

N 


E 


[C-D^H 2 ] 


= E 


IFl^l 2 

WF - -L>f If 


Ee 

i =1 


(C F );j — (Cf)v 


n 


= VVar(C^) M 


2—1 


and use Lemma A.3 


again. 


Since the eigenvalues of covariance are always nonnegative, we have AjA j > 0, therefore the 
spectral diagonal covariance decreases the expected squared error of sample covariance: 


E 


IC-D^Hl] < E 


ic-c^n 2 ], 


with equality only if all A*A j = 0, i / j, that is, only in the degenerate case when the exact 
covariance C has rank at most one. To compare the error terms further, note that (X^Li = 

4 
































Ejj'=l AjAj ~ Yli,j=l,i^j AjA j + E”=i Af, which shows that the error of the sample covariance 
depends on the d 1 norm of the eigenvalues sequence, 


E[||C — C N \\ 2 ] 


1 


N - 1 




1 


N — 1 


l = A + \\{\ k } n k=1 \\l 


while the error of the spectral diagonal approximation depends only on the d 2 norm, 


EHIC-D 


iV 1121 
F 


N- 1 




which is weaker as the state dimension n —> oo. The improvement depends on the rate of decay 
of the eigenvalues as the index k —> oo. Note that the eigenvalues of the covariance (if it exists) 
of a random element in an infinitely dimensional Hilbert space must satisfy the trace condition 
Da Prato (2006|). The eigenvalues of the covariance in many physical systems 


EfcLi Afc < OO, e.g. 


obey a power law, A*, 
and n —> oo. Then, 


k “ with a > 1, e.g., Gaspari and Cohn (1999). Suppose that A*, = ck 


\\{^r k=1 \\% 


\\M n k=1 \\l 


_ oo 

OO n, 

Y k~ 2a « / x~ 2a dx = 

k= 1 ^ 

^ oo 

oo « 

/ 

i ^ 


1 


x a dx = 


2a - 1’ 

1 


a- 1’ 


which gives the error ratio E [||C — D A |||] / E [||C — C jV ||p] —> 0 as a —> 1 + . Other considerations 


of similar ratios can be found in Furrer and Bengtsson (2007). Theorem 5.1 is related to but 
different from the estimate in Furrer and Bengtsson (2007, eq. (12)), which applies to the case 


when the mean known exactly rather than the sample covariance here. Also, the analysis in Furrer 


and Bengtsson (2007) is in the physical domain rather than in the spectral domain. 


6. Spectral EnKF algorithms. We will show that the analysis step can be implemented 
very efficiently in cases of practical interest. We drop the ensemble members index in all update 
formulas to make them more readable. Note that when using all the following formulas, it is 
necessary to perturb the observations. 

6.1. State consisting of only one variable, completely observed. Assume that the state 
consists of one variable, e.g., X E M n , and that we can observe the whole system state, i.e., the 
observation function is the identity, H = I, and observations are Y E M n . Assume also that the 
observation noise covariance matrix is cl, where c > 0 is a constant. In this special case, we can 
do the whole update in the spectral space, since it is possible to transform the innovation to the 


spectral space, and the analysis step (4.4) becomes 


X a = X - F*Dp (Dp + cl) 1 F (X - Y) 


Note that the matrices Dp and Dp + cl are diagonal, so any operation with them, such as 
inversion or multiplication, is very cheap. The matrix F is never formed explicitly. Rather, the 
multiplications of F and F* times a vector are implemented by the fast Fourier transform (FFT) or 

2 0 l()a 1 b[ ) 


discrete wavelet transform (DWT). This is the base case of the FFT EnKF (Mandel et al. 


and the wavelet EnKF (Beezley et al. 2011), respectively. 
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6.2. Multiple variables on the same grid, one variable completely observed. In a 

typical model, such as numerical weather prediction, the state consist usually of more than one 
variable. Assume the state consist of m different variables all based on the same grid of length n. 
Then each variable can be transformed to the spectral space independently, and we have the state 
vector X E M n ' m and the transformation matrix in the block form 



'Xi 


F 

0 

... o' 

X = 

Xo 

, F = 

0 

F 

0 


X m _ 


0 


0 F 


where each block X\ is a vector of length n and F is n by n transformation matrix. 

Assume also that the whole state of the first variable X\ is observed, and again the covariance 
of observation error is cl. In this case, the observation operator is one by m block matrix of the 
form H = [I 0 • • • 0]. In the proposed method, we approximate the crosscovariancess between 

the variables also by the diagonal of the sample covariance in spectral space, Dp = [D^]’”_ 1 , 
where D ij is matrix containing only diagonal elements from the sample covariance matrix between 
transformed variables FA; and FA;. With this notation, the analysis step (4.4) becomes 


"xf 


'Xi 


'F*Df,' 

1 

.. 

1_ 


Xm_ 


F*D^i_ 


(Dip + cl) ^(Ai-Y). 


( 6 . 2 ) 


Note that again the matrix to be inverted is diagonal and full-rank, and the transformation F is 
implemented by call to FFT or DWT, so the operations are computationally very efficient. A related 
method using interpolation and projection was proposed for the case when the model variables are 


defined on non-matching grids (Beezley et al.. 2011) 


6.3. Multiple variables on the same grid, one variable observed at a small number 
of points. This situation occurs, e.g., when assimilated observations are from discrete stations. In 
this case, the observation matrix is H = [Hi 0 • • • 0], where Hi has a small number of rows, 

one for each data points, and X and F are the same as in Eq. (6.1). We substitute the diagonal 
spectral approximation into the analysis step (4.4) directly, and (|6.2[) becomes 


X a = 


'Xi 


"F*D(y 

X m 




HiF*Df 1 FH^ + R 


-l 


'F(Xi-V) 


(6.3) 


The solution of a system of linear equations with the matrix HiF*D{ v 1 FH^ + R in (6.3) does 


not present a problem, because its dimension is small by assumption, and FHJ is easy to compute 
explicitly by the action of FFT on the columns of H). Note that in this case, the data noise 
covariance R may be arbitrary. 


6.4. State consisting of more variables, one partly observed. Consider the situation 
when the number of observation points is too large for the method of Sect. |6.2| to be feasible, 
but only one variable on a contiguous part of the mesh is observed. The typical example of this 
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type may be radar images, which cover typically only a part of domain of the numerical weather 
prediction model. 

Suppose that observations (Y") ■ of the values of the first variable (Xi) ■ are available only for 
a subset of indices j G M C m }. Augment the forecast state by an additional variable 


Xq. For j = 1 set (X 0 ) j = (Xi )j if j G M, (X 0 ) j = (Y) J = 0 if j £ M. We can now 

use the analysis update (6.2) with the augmented state X = (Xq, Xi, ..., X m ) and observation 


Y = (Y, 0,..., 0), to get the augmented analysis X = (Xq, X\, ..., X and drop Xq. 

Note that the innovations to the original variables are propagated through the spectral diag¬ 
onal approximation of cross covariance between the original and augmented variables. Since this 
covariance is not spatially homogeneous, a Fourier basis will not be appropriate, and computational 
experiments in Sect. [ 7 ] confirm that wavelets indeed perform better. 

7. Computational experiments. In all experiments, we use the usual twin experiment ap¬ 
proach. A run of the model from one set of initial conditions is used to generate a sequence of 
states, which plays the role of truth. Data values were obtained by applying the observation opera¬ 
tor to the truth; the data perturbation was done only for ensemble members within the assimilation 
algorithm. A second set of initial conditions is used for data assimilation and for a free run, with 
no data assimilation, for comparison. The error of the free run should be an upper bound on the 
error of a reasonable data assimilation method. 

We evaluate the filter by the root mean square error, RMSE = Ya =i ((^0?. — (^ a )J 2 ) 1//2 , 
where X° is the analysis ensemble mean, X is the true state, and n is the number of the grid 
points x l . In the case when the state consist of more than one variable, such as in the shallow 
water equations, we evaluate the error of each variable independently. While the purpose of a 
single analysis step is to balance the uncertainties of the state and the data rather than minimalize 
the RMSE, the RMSE values over multiple time steps are used to evaluate how well the data 
assimilation fulfills its overall purpose to track the truth. 

We evaluate the RMSE of the the standard EnKF, marked as EnKF in the legend of the figures, 
and the spectral diagonal EnKF with the discrete sine transform, discrete cosine transform, and 


the Coiflet 2,4 discrete wavelet transform (Daubechies, 1992), marked as DST, DCT, and DWT, 
respectively. 


7.1. Lorenz 96. In the Lorenz 96 model (Lorenz, 2006), the state consists of one variable 
X t G , X. t — (x^, • • •, xj^), governed by the differential equations 


dxj 

dt 


Xj-iXj+i - Xj-iXj -2 — Xj + F, j = 1,..., K, 


where the values of xj_k and Xj + k are defined to be equal to Xj for each j = 1 .... ,K, and F 
is a parameter. We set the parameter F = 8, which causes the system to be strongly chaotic. 
The timestep of model was set to 0.01 s and the analysis cycle was 1 s. The data covariance was 
diagonal, with diagonal entries equal to 0.04. The ensemble and the initial conditions for the truth 
were generated by sampling from 1V(0.0005,0.01). The the ensemble and the truth were moved 
forward for 10 second, then the assimilation starts. 

In the case when the whole state is observed, spectral filters with ensemble size N = 4 (Fig. 0*) 
already decrease the error significantly compared to a run with no assimilation, while the standard 
EnKF actually increases the error. For all filters, the error eventually decreases with the ensemble 
size at the standard rate JV -1 / 2 , but spectral EnKF shows the error decrease from the start, while 
the EnKF lags until the ensemble size is comparable to the state dimension, and even then its 
RMSE is significantly higher (Fig. 
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Ensemble size 


(a) 


(b) 


Fig. 7.1. Mean RMSE from 10 realizations for Lorenz 96 problem, the whole state observed, state dimension 256 
(a) increasing analysis cycles with ensemble size 4, (b) increasing ensemble size, analysis cycle 1. 



Assimilation cycle 

(a) 



(b) 


Fig. 7.2. Mean RMSE from 10 realizations for the Lorenz 96 problem, ensemble size 16, state dimension 256. 
(a) first 128 points observed, (b) first 64 points observed. 


Next, consider the case when only the first m points of a grid are observed. In the legend, 
DCT-S and DWT-S are the method with the discrete cosine transform, and the Coiflet 2,4 discrete 
wavelet transform, respectively, with the standard analysis update (4.4), while DCT-A and DWT- 
A use the augmented state method from Sect. (h4 Figure E2 shows that the spectral diagonal 
method decrease the RMSE, while the standard EnKF is unstable. This observation is consistent 
with the result of Kelly et al. (2014), which shows that, for a class of dynamical systems, the EnKF 



























remains within a bounded distance of truth if sufficiently large covariance inflation is used and if the 
whole state is observed. The augmented state method DWT-A with wavelet transformation gave 
almost the same analysis error as DCT-S, which is using the spectral diagonal filter with the exact 
observation matrix, while the cosine basis, which implies a homogenenous random field, resulted 
in a much larger error (method DCT-A). A similar behavior was seen with a smaller number of 
observed points as well, but the error reduction in spectral diagonal EnKF was smaller (not shown). 

7.2. Shallow water equations. The shallow water equations can serve as a simplified model 
of atmospheric flow. The state Y = (h, u , v) consists of water level height h and momentum u, v in 
x and y directions, governed by the differential equations of conservation of mass and momentum, 


d(hu) 

dt 

d(hv) 

dt 


dh d(uh ) d(vh ) 

dt dx dy 


d 


1 


+ cTT ( hu2 + 


+ 


dx 
d(huv ) 

dx 


+ 


d(huv) 


dy 


d 


1 


+ —\hv + - gh' 


= 0, 
= 0 , 
= 0 , 


where g is gravity acceleration, with reflective boundary conditions, and without Coriolis force or 
viscosity. The equations were discretized on a rectangular grid size 64 x 64 with horizontal distance 
between grid points 150 km and advanced by the Lax-Wendroff method with the time step 1 s. 
The initial values where water level h = 10 km, plus Gaussian water raise of height 1 km, width 
32 nodes, in the center of the domain, and u = v = 0. See Moler (2011, Chapter 18) for details. 

We have used two independent initial conditions, one used for the truth and another for the 
ensemble and the free run. The only difference was the location of the initial wave. Both states 
were moved forward for 4 hours. Then the ensemble was created by adding random noise (with 
prescribed background covariance). Then, all states were moved forward for another hour, and 
assimilation starts 5 h after the model initialization. All assimilation methods start with the same 
forecast in the first assimilation cycle. 

The background covariance for initial ensemble perturbation was estimated using samples taken 
every second from time i star t = 4 h to time t e nd = 6 h, and modified by tapering the sample 
covariance matrix Cat as B = Cjv ° T, where the tapering matrix T had the block structure 


A 

0 

o' 


0 

A 

A' 

0 

A 

0 

+ 0.9 

A 

0 

A 

0 

0 

A 


A 

A 

0 


where the entry between nodes ( i a ,j a ) and ( ib,jb ) is A a) b = exp(— \i a — ib\) exp(— \j a —jb\)- 2D tensor 
product FFT and DWT were used in the diagonal spectral EnKF. The observation error was taken 
with zero mean and variance 1000 m 2 in h and 1000 kg m s -1 in u and v. The forecast ensemble was 
created by adding random noise with the covariance B 4 h after the model initialization. To relax 
the ensemble members, the model was run for another hour before the assimilation started. So the 
first assimilation was performed 5 hours after the model initialization. After the first assimilation, 
another 4 assimilation cycles were performed every 60 s. 

When the full state is observed, the spectral diagonal method decreased the RMSE in all 
variables dramatically (Fig. 7.3), unlike the standard EnKF. When only the water level is observed, 
the RMSE in spectral diagonal EnKF decreases less, but still much more that in the standard EnKF 
(Fig. Ob. 
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Fig. 7.3. RMSE of ensemble mean of one realization of three assimilation cycles (f - forecast error, a - analysis 
error). Full state was observed. The length of assimilation cycle 60 second, ensemble size 20. (a) water height (b) 
momentum in the x direction (c) momentum in the y direction 



(a) (b) (c) 


Fig. 7.4. Mean RMSE of ensemble mean from 5 independent repetitions. Ensemble size 20, only water height 
observed, (a) water height (b) momentum in the x direction (c) momentum in the y direction 


8. Conclusions. A version of the ensemble Kalman filter was presented, based on replacing 
the sample covariance by its diagonal in the spectral space, which provides a simple, efficient, 
and automatic localization. We have demonstrated efficient implementations for several classes of 
observation operators and data important in applications, including high-dimensional data defined 
on a continuous part of the domain, such as radar or satellize images. The spectral diagonal was 
proved rigorously to give a lower mean square error that the sample covariance. Computational 
experimens with the Lorenz 96 problem and the shallow water equations have shown that the 
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method that the analysis error drops very fast for small ensembles, and the method is stable over 
multiple analysis cycles. The paper provides a new technology for data assimilation, which can 
work with minimal computational resources, because an implementation needs only an orthogonal 
transformation, such as the fast Fourier or discrete wavelet transform, and manipulation of vectors 
and diagonal matrices. Therefore, it should be of interest in applications. 
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Appendix A. Properties of sample covariance matrix. 

Let U k ~ A (0, C) be independent random vectors in M n or C n . For each U k . we have the 
Karhunen-Loeve decomposition 


U k = \] /2 0j, k Uj, 9j )k ~ A(0,1) independent. 


(A.l) 


3 =1 


were A j > 0 are the eigenvalues and Uj orthonormal eigenvectors of the covariance matrix C. Let 
F = [tii,..., u n ]* . By a direct computation, we have in the basis of the eigenvectors: 

Lemma A.l. The random vector U p = FU ~ A (0, Cp), where Cp = FCF* is a diagonal 
matrix with Ai,..., A n on the diagonal. 


Next, we use (A.l) to compute an expansion of the sample covariance entries. 


Lemma A.2. Let Cp be the sample covariance of Up,... ,U p, cf., (4-1). Then, 


( c f )«■ = (E - 4 E on E ** 

\fc=l Z=1 m=l 


A A 


(A.2) 


Proof. From the definition of the sample covariance, 
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( c f)m = V3t£( ui-u r ){ui-u r ) 

k= 1 
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A — 
1 

a^ 

(AjA 


fc=i 
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Z=1 


m=l 
A 


/A A A 

i iKW-siKsK. 

m=l 


A — 


. 3 A 

fc=l Z=1 

n1/2 / iV A A 

(j - ( ^ i ’ 1 ® J ’ m 

\fc=l Z=1 m=l 


Finally, we use the expansion (A.2) to derive the variance of the sample covariance entries. 
Lemma A.3. The variance of each entry of Cp is 


Var[(Cp )jj] = 


2A 

A—1 

Aj Aj 


*/ * = j, 

*/»+ j- 
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Proof. The sample covariance is unbiased estimate of the true covariance, so from Lemma A.2 

12 " r '~ v ' ) M ] 2 
N \ 1 2 

y (0i,k9i,i) - At 


Var [(Cp)y] = E [(C£) M - E [(C^) M ]] 2 = E [(C£) M - (C F ) M ] : 


= E 


(AjAi) 1/2 l y 1 
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AT (AT — 1 ) 

The random variables & are i.i.d., so it follows that 


+ A 


(A.3) 


E \di t k@i,l@i,m@i,n\ — * 


3 if k = l = m = 71, 

1 if k = l,m = n, k ^ m, 

1 if k = m, l = n, k ^ l, 

1 if k = n,l = m,k ^ l, 

0 otherwise, 


and we can compute all the expected values in Eq. (A.3), 
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Together, we get 


Var [(Cp)*,*] = A 2 


2 (N{N + 2) 2(N + 2) 
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The variance of the off-diagonal entry (C p)ij, i / j, is 


Var [{C$) id ] = E [(C^ - E [(C#)^]] 2 = E [(C^) 0 - - (C F )^-] 
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The integrals in Eq. (A.4) are 
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So, the variance of an off-diagonal element is Var [(Cf ),j] = (-V — 2 + 1) = 


A j A j 


(A.4) 
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